
    Screening of Obstructive Sleep Apnea with Empirical Mode Decomposition of Pulse Oximetry

    Detection of desaturations in the pulse oximetry signal is of great importance for the diagnosis of sleep apneas. By counting desaturations, an index can be built to help in the diagnosis of severe cases of obstructive sleep apnea-hypopnea syndrome. It is important to have automatic detection methods that allow screening for this syndrome, reducing the need for expensive polysomnography-based studies. In this paper a novel recognition method based on the empirical mode decomposition of the pulse oximetry signal is proposed. The desaturations produce a very specific wave pattern that is extracted in the modes of the decomposition. Using this information, a detector based on properly selected thresholds and a set of simple rules is built. The oxygen desaturation index constructed from these detections yields a detector for obstructive sleep apnea-hypopnea syndrome with high sensitivity (0.838) and specificity (0.855), and gives better results than standard desaturation detection approaches. Comment: Accepted in Medical Engineering and Physics
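A minimal sketch of the screening idea described above: count desaturation events in an SpO2 trace and divide by recording time to get an oxygen desaturation index (ODI). The event criterion (a drop of at least 3 points below a running baseline) and the recovery rule are illustrative assumptions, not the paper's actual EMD-based rules.

```python
# Hypothetical ODI screener: the >= 3-point drop and recovery margin are
# illustrative thresholds, not the thresholds selected in the paper.

def count_desaturations(spo2, drop=3.0):
    """Count desaturation events: SpO2 falls `drop` points below a
    running baseline; the event ends once the signal recovers."""
    events = 0
    baseline = spo2[0]
    in_event = False
    for value in spo2:
        if not in_event and value <= baseline - drop:
            in_event = True
            events += 1
        elif in_event and value >= baseline - 1.0:
            in_event = False
        if not in_event:
            # Slowly track the baseline while outside an event.
            baseline = 0.95 * baseline + 0.05 * value
    return events

def odi(spo2, hours):
    """Desaturation events per hour of recording."""
    return count_desaturations(spo2) / hours

# Synthetic one-hour trace: stable 97% with two clear desaturation dips.
trace = [97.0] * 20 + [92.0] * 5 + [97.0] * 20 + [91.0] * 5 + [97.0] * 10
print(odi(trace, hours=1.0))  # prints 2.0
```

In the paper the events are detected in the empirical-mode-decomposition components of the signal rather than in the raw trace; only the counting step is sketched here.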

    Discovering network relations in big time series with application to bioinformatics

    Big Data concerns large-volume, complex and growing data sets with multiple and autonomous sources, and it is now rapidly expanding in all science and engineering domains. Time series represent an important class of big data that can be obtained from several applications, such as medicine (electrocardiograms), environmental monitoring (daily temperatures) and finance (weekly sales totals, prices of mutual funds and stocks), as well as from many other areas, such as social networks and biology. Bioinformatics seeks to provide tools and analyses that facilitate the understanding of living systems by analyzing and correlating biological information. In particular, as increasingly large amounts of gene information have become available in recent years, more efficient algorithms for dealing with such big data in genomics are required. There is growing interest in this field in discovering the network of regulations among a group of genes, named Gene Regulation Networks (GRN), by analyzing gene expression profiles represented as time series. The GRNNminer method has been proposed for this task: it discovers the underlying GRN among a group of genes through proper modeling of the temporal dynamics of the gene expression profiles with artificial neural networks. However, it requires building and training a pool of neural models for each possible gene-to-gene relationship, which amounts to executing a very large set of experiments of order O(n^2), where n is the total number of genes involved. This work presents a proposal for dramatically reducing that number of experiments to O((n/k)^2) when big time series are involved in reconstructing a GRN, by first clustering the gene profiles into k groups using self-organizing maps (SOM). This way, GRNNminer can be applied over smaller sets of time series, only those appearing in the same cluster. Sociedad Argentina de Informática e Investigación Operativa (SADIO)
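The complexity reduction above can be sketched by counting experiments directly: the naive approach tests every ordered gene-to-gene pair, while the proposed approach tests only pairs that fall in the same cluster. The partition below is a hard-coded placeholder for what the SOM clustering step would produce.

```python
# Illustrative count of gene-to-gene experiments: O(n^2) for all ordered
# pairs vs. O((n/k)^2) per cluster after grouping genes into k clusters.
# The fixed partition stands in for the SOM clustering used in the paper.

from itertools import permutations

def all_pairs(genes):
    """Every ordered gene-to-gene pair: the naive experiment set."""
    return list(permutations(genes, 2))

def within_cluster_pairs(clusters):
    """Only ordered pairs whose genes fall in the same cluster."""
    pairs = []
    for cluster in clusters:
        pairs.extend(permutations(cluster, 2))
    return pairs

genes = [f"g{i}" for i in range(12)]
# Placeholder partition into k = 3 clusters (a SOM would produce this).
clusters = [genes[0:4], genes[4:8], genes[8:12]]

print(len(all_pairs(genes)))                # 12 * 11 = 132 experiments
print(len(within_cluster_pairs(clusters)))  # 3 * (4 * 3) = 36 experiments
```

With n = 12 and k = 3, the experiment count drops from 132 to 36; for thousands of genes the saving is what makes the approach tractable.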

    Cluster Ensembles for Big Data Mining Problems

    Mining big data involves several problems and new challenges, in addition to the huge volume of information. On the one hand, these data generally come from autonomous and decentralized sources, so their dimensionality is heterogeneous and diverse, and they generally involve privacy issues. On the other hand, data mining algorithms such as clustering methods have particular characteristics that make them useful for different types of problems. Due to the huge amount of information, the task of choosing a single clustering approach becomes even more difficult. For instance, k-means, a very popular algorithm, always assumes spherical clusters in the data; hierarchical approaches can be used when there is interest in finding that type of structure; expectation-maximization iteratively adjusts the parameters of a statistical model to fit the observed data. Moreover, all these methods work properly only with relatively small data sets; large volumes of data often make their application unfeasible, not to mention data coming from autonomous sources that are constantly growing and evolving. In recent years a new clustering approach has emerged, called consensus clustering or cluster ensembles. Instead of running a single algorithm, this approach first produces a set of data partitions (an ensemble) by employing different clustering techniques on the same original data set. This ensemble is then processed by a consensus function, which produces a single consensus partition that outperforms the individual solutions in the input ensemble. This approach has been successfully employed for distributed data mining, which makes it very interesting and applicable in the big data context. Although many techniques have been proposed for large data sets, most of them mainly focus on making individual components more efficient instead of improving the whole consensus approach for the case of big data. Sociedad Argentina de Informática e Investigación Operativa (SADIO)
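One common consensus function, sketched minimally here, is the co-association matrix: each ensemble member votes on whether two points belong together, and the consensus links points whose vote fraction exceeds a threshold. The ensemble below is hard-coded for illustration; in practice it would come from running several clustering algorithms on the same data, and real consensus functions are more sophisticated than this greedy linking.

```python
# Minimal co-association consensus sketch (an assumption for illustration,
# not a specific method from the paper).

def co_association(ensemble, n):
    """Fraction of partitions that place points i and j in the same cluster."""
    m = [[0.0] * n for _ in range(n)]
    for labels in ensemble:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    m[i][j] += 1.0 / len(ensemble)
    return m

def consensus(ensemble, n, threshold=0.5):
    """Greedy consensus: link points co-associated above the threshold."""
    m = co_association(ensemble, n)
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        if labels[i] == -1:
            labels[i] = next_label
            for j in range(i + 1, n):
                if labels[j] == -1 and m[i][j] > threshold:
                    labels[j] = labels[i]
            next_label += 1
    return labels

# Three partitions of 6 points; the members disagree only on point 2.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
print(consensus(ensemble, n=6))  # prints [0, 0, 0, 1, 1, 1]
```

The majority vote resolves the disagreement on point 2, which is the basic mechanism by which a consensus partition can outperform individual ensemble members.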

    A Novel Method to Control the Diversity in Cluster Ensembles

    Clustering is fundamental to understanding the structure of data. In the past decade the cluster ensemble problem has been introduced, which combines a set of partitions (an ensemble) of the data to obtain a single consensus solution that outperforms all the ensemble members. Although disagreement among ensemble partitions (diversity) has been found to be fundamental for success, the literature has arrived at conflicting conclusions: some authors suggest that high diversity is beneficial for the final performance, whereas others have indicated that medium diversity is better. While there are several options for measuring diversity, there is no method to control it. This paper introduces a new ensemble generation strategy and a method to smoothly change the ensemble diversity. Experimental results on three datasets suggest that this is an important step towards a more systematic approach to analyzing the impact of ensemble diversity on the overall consensus performance. Sociedad Argentina de Informática e Investigación Operativa (SADIO)
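One common way to quantify the diversity discussed above is the average pairwise disagreement between member partitions, sketched here as 1 minus the Rand index. A generation strategy could then tune a parameter (e.g. a subsampling rate or the number of clusters per member) until this measure hits a target value; that control loop is the paper's contribution and is not reproduced here.

```python
# Pairwise-disagreement diversity measure (one of several options the
# abstract alludes to; the paper's own measure may differ).

from itertools import combinations

def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    agree = 0
    pairs = list(combinations(range(len(a)), 2))
    for i, j in pairs:
        if (a[i] == a[j]) == (b[i] == b[j]):
            agree += 1
    return agree / len(pairs)

def diversity(ensemble):
    """Mean pairwise disagreement: 0 for identical members."""
    scores = [1.0 - rand_index(a, b) for a, b in combinations(ensemble, 2)]
    return sum(scores) / len(scores)

identical = [[0, 0, 1, 1]] * 3
mixed = [[0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 0, 1]]
print(diversity(identical))  # prints 0.0
print(diversity(mixed))      # higher: the members disagree
```

Comparing partitions at the level of point pairs makes the measure insensitive to cluster relabeling, which is why Rand-style indices are a natural choice here.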


    A method for daily normalization in emotion recognition

    Get PDF
    Affects carry important information in human communication and decision making, and their use in technology has grown in the past years. In particular, emotions have a strong effect on physiology, which can be assessed through biomedical signals. These signals have the advantage that they can be recorded continuously, but they can also be intrusive. The present work introduces an emotion recognition scheme based only on photoplethysmography, aimed at lowering invasiveness. The feature extraction method was developed for a realistic real-time context. Furthermore, a feature normalization procedure is proposed to reduce daily variability. For classification, two well-known models were compared. The proposed algorithms were tested on a public database consisting of 8 emotions expressed continuously by a single subject across different days. Recognition tasks were performed for several numbers of emotional categories and groupings. Preliminary results show promising performance with up to 3 emotion categories. Moreover, the recognition of arousal and emotional events improved for larger emotion sets. Sociedad Argentina de Informática e Investigación Operativa (SADIO)
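A hypothetical sketch of per-day feature normalization of the kind described: each day's features are z-scored against that day's own statistics, so day-to-day baseline drift in the physiological signal is removed before classification. The paper's actual procedure may differ.

```python
# Per-day z-scoring to remove daily baseline shifts (an illustrative
# assumption, not necessarily the paper's exact normalization).

def daily_zscore(features_by_day):
    """Z-score each day's feature values using that day's mean and std."""
    normalized = {}
    for day, values in features_by_day.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = var ** 0.5 or 1.0  # guard against a constant day
        normalized[day] = [(v - mean) / std for v in values]
    return normalized

# Same underlying pattern, but day 2 has a shifted baseline.
raw = {"day1": [60.0, 62.0, 64.0], "day2": [70.0, 72.0, 74.0]}
norm = daily_zscore(raw)
print(norm["day1"] == norm["day2"])  # prints True: daily offset removed
```

After normalization the two days are indistinguishable, which is exactly the property a classifier trained across days needs.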

    Multi-center anatomical segmentation with heterogeneous labels via landmark-based models

    Learning anatomical segmentation from heterogeneous labels in multi-center datasets is a common situation encountered in clinical scenarios, where certain anatomical structures are annotated only in images coming from particular medical centers, but not in the full database. Here we first show how state-of-the-art pixel-level segmentation models fail at naively learning this task due to domain memorization issues and conflicting labels. We then propose to adopt HybridGNet, a landmark-based segmentation model which learns the available anatomical structures using graph-based representations. By analyzing the latent space learned by both models, we show that HybridGNet naturally learns more domain-invariant feature representations, and we provide empirical evidence in the context of chest X-ray multiclass segmentation. We hope these insights will shed light on the training of deep learning models with heterogeneous labels from public and multi-center datasets.
